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Abstract 

This paper describes an exponential transient excision algorithm (ETEA). In biomedical time series 
analysis, e.g., in vivo neural recording and electrocorticography (ECoG), some measurement artifacts take 
the form of piecewise exponential transients. The proposed method is formulated as an unconstrained 
convex optimization problem, regularized by smoothed ^i-norm penalty function, which can be solved by 
majorization-minimization (MM) method. With a slight modification of the regularizer, ETEA can also 
suppress more irregular piecewise smooth artifacts, especially, ocular artifacts (OA) in electroencephalog¬ 
raphy (EEG) data. Examples of synthetic signal, EEG data, and ECoG data are presented to illustrate 
the proposed algorithms. 


1 Introduction 

This work is motivated by the problem of suppressing various types of artifacts in recordings of neural activity. 
In a recent study [25) . typical artifacts in in vivo neural recordings are classified into four types (Type 0 to 
3, see Section 2.2 and Figure 3 in [2S]). This classification covers many artifacts in the scope of human brain 
activity recordings, e.g., electroencephalography (EEG) and electrocorticography (ECoG). In this paper, we 
consider the suppression of Type 0 and Type 1 artifacts. For the purpose of flexibility and generality, we 
redefine them in terms of morphological characteristics: 

• Type 0: a smooth protuberance that can be modeled as x{t) = when t ^ to- 

• Type 1: an abrupt jump followed by an exponential decay that can be modeled as x(t) = e““k when 
t^to. 

Figure [1] shows examples of the two types of artifacts. We do not consider the other two types in this work 
because our previous works have addressed efficient algorithms to remove such artifacts. For instance, low- 
pass filtering/total variation denoising (LPF/TVD) [61] suppresses Type 2 artifacts (Figured!:), and lowpass 
filtering/compound sparse denoising (LPF/CSD) [601161] can remove sparse and blocky spikes (Type 3 shown 
in Figure dJi). 

The approach proposed in this paper is based on an optimization problem intended to capture the primary 
morphological characteristics of the artifacts using sparsity-inducing regularization. To formulate the problem, 
we model the observed time series as 


y(.t) = fit) + x{t) + w{t), (1) 
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Figure 1: Examples of (a) Type 0 artifact, and (b) Type 1 artifact, and (c) Type 2 artifact, and (d) Type 3 
artifact. 


Table 1: Sparsity-promoting penalty functions (a > 0). 


Penalty 

abs 

log 

atan 


4>{u) 


ilog(l-|-a|K|) 



1 + 2a\u \X 

~7^J 


<t>^{u) 


y/v? + e 

- log(l + ay/V? + e) 


ay/i 


tan 


1 + 2ay/v? + e 


b(u) = u/(j>[{u) 
y/v? -h e 

y/u^ -I- ay/ 

^1 y/u'^ -I- e (^1 -t ay/u'^ + e + a'^{u'^ + e)j 


where / is a lowpass signal, a; is a piecewise smooth transient signal (i.e.. Type 0 or Type 1 artifacts), and 
w is stationary white Gaussian noise. More specifically, / is assumed to be restricted to a certain range of 
low frequencies. In other words, H(/) Ri 0, where H is a high-pass filter. Note that in the signal model 
©, conventional LTI filtering is not suitable to estimate either / or x from y, because component x, as a 
piecewise smooth signal comprised of transients, is not band limited. 

In order to estimate the components, we combine LTI filtering with sparsity-based techniques. We formu¬ 
late an optimization problem for both decomposition and denoising. A computationally efficient algorithm is 
derived to solve the optimization problem, based on the theory of majorization-minimization (MM) [T51IM1I55] . 

In addition, this paper specifies how to generate a smoothed penalty function and its majorizer from 
a non-smooth one, in order to overcome a numerical issue that arises when the penalty function is not 
differentiable. 

1.1 Related works 

Some recent works recover signals with transients by various algorithms. In |19| . a slowly varying signal is 
modeled as a local polynomial and an optimization problem using Tikhonov regularization is formulated to 
capture it. In |47| . the slowly varying trend is modeled as a higher-order sparse-derivative signal (e.g., the 
third-order derivative is sparse). 
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Instead of estimating the slowly varying component via regularization, the LPF/TVD method [5T] esti¬ 
mates a lowpass component by LTI filtering and a piecewise constant component by optimization. In this 
case, an optimization problem is formulated to estimate the piecewise constant component. The approach 
proposed here uses a similar technique to recover the lowpass component, but in contrast to LPF/TVD, it 
is more general — the regularization is more flexible with a tunable parameter, so that LPF/TVD can be 
considered as a special case. 

Another algorithm related to the approach taken in this paper is the transient artifact reduction algorithm 
(TARA) which is utilized to suppress additive piecewise constant artifacts and spikes (similar to a hybrid 
of Type 2 and Type 3 artifact). The approaches proposed in this work target different types of artifacts 
(Type 0 and Type 1) and applied in different applications. 

The irregularity of Type 0 transients leads to a more complicated artifact removal problem, where the 
artifact are irregular fluctuations. A typical example in EEG is ocular artifacts (OA) caused by the blink 
and/or movement of eyes. To suppress OA, there are approaches based on empirical mode decomposition 
(EMD) [m 1321 SSI EH] ! and on independent component analysis (ICA) methods [2 [T21 HOI HSl EZl SH] • The 
concept of spatial-frequency in acoustic analysis is also used to remove OA from multichannel signals [331 El] ■ 
In this work, we present a new method to suppress ocular artifacts by proposing a specific model and using 
sparse optimization. 

This paper adopts a regularizer inspired by the generalized I-D total variation m, wherein the derivative 
operator in conventional total variation regularizer is generalized to a recursive filter. The regularizers adopted 
in ETEA and second-order ETEA coincide with first-order and second-order cases of generalized I-D total 
variation, respectively. Some differences to the problem discussed in [27] are as follow. Eirstly, the signal model 
(HD allows a lowpass baseline as a component, hence, ETEA can be seen as a combination of conventional 
LTI filtering and generalized I-D total variation. Secondly, we consider a formulation in terms of banded 
matrices, for computational efficiency. Thirdly, we give optimality conditions of the proposed problems, and 
use these conditions as a guide to set the regularization parameters. 


2 Preliminaries 


2.1 Notation 

We use bold uppercase letters for matrices, e.g., A and B, and bold lowercase letters for vectors, e.g., x and 
y. We use column vectors for one-dimensional series. Eor example, a vector x G is written as 

x= [x(0), a:(l), ••• , a:(A-l)]'^ (2) 

where [ • denotes matrix transpose. The ^i-norm and squared .^ 2 -norm of x are defined as 


The inverse transpose of a matrix is denoted as A The first-order difference operator D of size {N — l)xN 
is 


-1 


D := 


1 

-I 


I 


(4) 


-1 1 
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We use f{x; a) to denote a function of x determined by parameter a, to distinguish it from a function with 
two ordered arguments, e.g., f{x,y). 


2.2 LTI filter with matrix formulation 

A discrete-time filter can be described as 


Ofc yin -k)='^bk xin - k), 

k k 

with transfer function H{z) = B{z)/A{z). For a finite length signal, ([5]) can be rewritten as 


(5) 


Ay = Bx, 


( 6 ) 


where A,B G are both banded matrices. When H is a zero-phase filter with a-k = ak and b-k = bk, 

then A and B are both symmetric Toeplitz matrices 


A = 


Oi Oo 
Oq Oi Oo 


B = 


bi 

bo 


bo 

bi 


ao oi oo 
ao ai 


bo bi bo 
bo bi 


(7a) 


(7b) 


In this case, disregarding a few samples on both edges, y can be re-written as: 


y = BA^^x = Hx. 


( 8 ) 


A detailed description of such zero-phase filters is given in Section V of Ref. EH, where two parameters, d 
and /c, determine the order of the filter and the cut-off frequency, respectively. In this paper, B is a square 
matrix, in contrast to m- The simplest case (d = 1) is given by ([7]). 


2.3 Majorization-Minimization 

Majorization-Minimization (MM) [151 l35l [57] is a procedure to replace a difficult optimization problem by a 
sequence of simpler ones [HJ [Ml 133 [SI]. Suppose a minimization problem has a objective function J : ^ 

K, then the majorizer G(u, v) : K.^ x — >■ K satisfies 

G(u, v) = J(u), when u = V, (9a) 

G(u,v) > d(u), when u ^ v. (9b) 
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Figure 2: (a) Convex non-smooth penalty function and corresponding smoothed penalty function, (b) Non- 
convex and smoothed penalty functions, (c) Majorizer (gray) of the smoothed penalty function in (a), (d) 
Majorizer (gray) of the smoothed non-convex penalty function in (b). For all the figures we set e = 0.05, v = 
0.5, a = 2. 


Then MM iteratively solves 


= argminG(u, u(^')) (10) 

U 

until convergence, where k is iteration index. The detailed derivation and proof of convergence of MM are 
given in [3 5) . 

3 Majorization of smoothed penalty function 

When using sparse-regularized optimization to recover a signal, the £i-norm is widely utilized. To further 
enhance sparsity, some methods use iteratively re-weighting procedures [H 13411451163j , or a non-convex pseudo 
.^p-norm (0 < p < 1), or a mixed-norm in the regularizer [51 168115511581133) . Other non-convex penalty functions 
can also be used. In [46] a logarithm penalty function is discussed (‘log’ function in Tabled]), and in [59], an 
arctangent function is used (‘atan’ function Table [ij- However, each of these functions is non-difterentiable 
at zero, where 4>'{0~) ^ ((>'(0“*'). A method, to avoid the discontinuity in the derivative, is to smooth the 
function. 

Consider the smoothed function I'M. ^ IR+ 

(fieiu) := (j){p{u;e)), (11) 

where function p : K. —>■ M+ is defined as 

p{u; e) := y/u'^ + e, e > 0. (12) 
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Because + e is always greater than zero, </>£ has a continuous first-order derivative 


</>£(■«) = ^ -He). (13) 

-I- e 

Table [T] gives the smoothed penalty functions corresponding to (j) in the first column, and Figure [2jab) 
illustrates the penalty function and its smoothed version. The parameter e controls the similarity of (j)^ to 
non-smooth function </). When e —>■ 0, ~ 4‘i'U')- In practice, we set e to a very small number (e.g. 10“^°). 

3.1 Majorization of smoothed penalty function 

We assume the non-smooth function ip satisfies the following conditions: 

1. </>(«) is continuous on R. 

2. is twice continuously differentiable on R\{0}. 

3. (/>(u) is even symmetric: (/>(u) = (p{—u). 

4. (/>(u) is increasing and concave on R+. 

Under such assumptions, we find a quadratic function g : R x R —>■ R, 

= + (j){v)-^(p'{v), (14) 

majorizing cp when v ^ 0 m Lemma 1]. However, (p might not be differentiable at zero, e.g., all functions (p 
in Table [T] To avoid this issue, we use (HH to ensure the objective function is continuous and differentiable. 
Then the MM method can be applied without the occurrence of numerical issues (e.g., divide by zero). The 
following proposition indicates a suitable majorizer of the smoothed penalty functions. 

Proposition 1. If function cp satisfies the listed four conditions in the previous paragraph and g in (HI is a 
majorizer of (j), then 


9e{u, v] (pf) = g{p{u; e), p(u; e); </>), 


(15) 


is majorizer of (p^{u) = (p{p{u;e)) foru,v € R, and if writing explicitly, then g^ in (I15|) is 


11^ V 

9e{u, v; (pe) = -f (pe{v) “ , 


(16) 


where ip{v) = v/(p'^{v) 0. 

The proof of Proposition [T] is given inO Figure [2j: and Figure [2ji give examples of using function (IT6l) to 
majorize the smoothed penalty functions in Figure [^i and Figure [5 Jd, respectively. 
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4 Exponential transient excision algorithm 

To formulate the problem of exponential transient excision, we define R as 


—r 1 


R 



—r 1 


(17) 


The first-order derivative operator D is a special case of R with r = 1. In this paper we restrict 0 < r < 1 
to distinguish them. As a filter, R can be seen to be a first-order filter attenuating low frequencies, with a 
transfer function 


R[z) := I - rz-^. (18) 

Driving the system R{z) with the step exponential 

0 , 

^n—riQ 

produces an impulse 6{n — uq) as an output. If input signal x is a step exponential with rate r, i.e., a Type 
1 artifact, then 

V = Rx (20) 

will be sparse. 

Note that r should be known beforehand. In practice, we can estimate it from the time-constant of a Type 
1 artifact. Using observation data y, if we can measure the time Nq over which a Type 1 transient decays to 
half its initial height, then r can be found by solving =0.5. We ignore the influence of the slowly varying 
baseline, as we assume the exponential decays much faster. If the transients have an approximately equal 
decay rate, then we can use an average measured form multiple transients. 

4.1 Problem definition 

In signal model ([T]), if an estimate x is known, then we can estimate / by 

f = (y-x)-H(y-x), (21) 

where H = is a highpass filter, where the A and B matrices are both banded with a structure as (I7|). 

Correspondingly, noise w is y — (x-hf) = H(y —x), which indicates a suitable data fidelity term ||H(y —x)|||. 
Moreover, when the component x is modeled as cm, we can use v = Rx as its sparse representation. Hence, 
we propose to formulate the estimation of x from noisy data y as the unconstrained optimization problem: 

X* = arginin|Pi(x) = ||H(y - x )||2 4- A ^ ^d[R-x]„)|, (22) 

n 

where A > 0 is the regularization parameter related to noise level. 



n < no, 
n ^ no, 


(19) 
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Table 2: Exponential Transient Excision Algorithm. 


Input: y, A, r, A, B 
Initialization: x € 

b= B^BA-V 

Repeat 

V = Rx 

= 2^(u(n)) 

Q = B^B + A^R^ARA 
X =AQ^ib 
Until convergence 
f = (y - x) - H(y - x) 
Return: x, f. 


4.2 Algorithm derivation 

Using the majorizer of smoothed penalty function (IT6l) . and recalling that H = BA”^ illustrated in Section [2l^ 
we majorize the objective function 


G(x,z;Pi) = ||H(y-x )||2 + A^gd[Rx]„,[Rz]„;(/)d 

n 

= ||BA-i(y-x)||2+x'rR'r[A(Rz)]Rx + G (23) 


where C does not depend on x, and [A(Rz)] € i)x(Af i) jg ^ diagonal matrix 

[^(R-^)]n.n= 2^([Rz]„) 

where := which is listed in Table [TJ Then the minimizer of (I23|) is given by 


(24) 


= A^^B^BA-i+RT[A(R z)1R A^^B^BAV 


(25) 


Note that, calculating (j^ directly requires the solution to a large dense system of equations. However, 
since A is invertable, we can factor A out and write (l2^ as 


xU+i) = a[b'^B + A"^R'^[A(Rx('=))]Ra] B'^BA^V, 


(26) 


where the MM procedure (fTUll is adopted with z = x^^^. Note that the matrix to be inverted in (1^ is banded, 
so that the solution can be computed efficiently by fast banded system solvers (e.g., [MIES])- Moreover, in 
(IMl) . all matrix multiplications are between banded ones. Table |2] summarizes the algorithm to solve 


4.3 Optimality condition and parameter selection 

For problem it is difficult to derive optimality conditions directly. Here, we consider the optimality 

condition indirectly via an equivalent problem, similar to the discussion of optimality condition for total 









variation problems in [lin] and for LPF/TVD in [ST]. 

To facilitate our derivation, we firstly denote the optimal solution of by x*. We then define v* = Rx* 
where R is given by (ITTl) . We define G G 


0 

1 


G := 


0 

1 0 

r 1 0 


r.N-3 


r.N-2 


r 1 0 

r 1 


(28) 


which satisfies 


RG = I. 


(29) 


The operator G acts as an inverse filter of R. Using the above assumptions, we derive the following proposition 
regarding optimality conditions for (1221) . 

Proposition 2. If k* is an optimal solution to (|22|) . then v* = Rx* is an optimal solution to problem 

V* = argminQ(v), (30) 

V 


where 


Q(v) = ||H(yo-Gv)|l2 + A^().,([v]„). 

n 


(31) 


and yo = y — (x* — GRx*). 
Proof. We define xq G as 


xo := Gv* = GRx* (32) 

where x* is the optimal solution to (1221) . Note that xq and x* share an identical v* = Rxq = Rx*. Moreover, 
we define the difference between x* and xg as 


d X* — xg. (33) 

It follows that Rd = Rx* — Rxg =0. As a consequence, Xg is the optimal solution to problem 

Xg =argminPi(u) (34a) 

X 

s.t. X = u — d. (34b) 

Note a::o(0) = 0 must be satisfied in (IMl) because of the definition of G. Thus, we can write an equivalent 
problem to (IMl) using (1^ . 


{xg,v*} = argminPi(u) 


(35a) 




s.t. X = u — d 
x= Gv 


(35b) 

(35c) 


where {xo,v*} must be the optimal solution. In problem (I35L u is uniquely determined by linear functions 
of V and d, so problem (1351) can be simplified by substituting the variables, 

{xo,v*} = argmin(5(v) 

X,V 

s.t. X = Gv 

where Q(v) = Pi(u), with u = Gv + d, and can be written explicitly, 

Q(v) = ||H(y - Gv - d)||2 + <(.,([RGv + Rd]„). (37) 

n 

Because RG = I and Rd = 0, we simplify (l37)l as 

QW = ||H(yo-Gv)||^ + A^0,([v]„) (38) 


(36a) 

(36b) 


where yo = y - d. 

In this case, the equality constraint in (1361) is redundant. We can solve the unconstrained problem 
(1501) first, and then compute xg by (1551) . This implies that v* is the optimal solution to (1501) and satisfies 
VQ(v*) = 0. □ 

Using Proposition[2l instead of problem (1221) . we alternatively consider the optimality condition of problem 
(1501) where as long as x* is an optimal solution of (1551) . v* is an optimal solution of (1501) . Moreover, we can 
rewrite the optimality condition of (|30|) as 

p{n) = A</)'([v*]„) = A<()'([Rx*]„), (39) 

where p = 2GTHTH(yo — xq), which can be rewritten as 

p = 2G'^H'^H(y-x*). (40) 

Note that, writing p as (1401) does not require the solution to problem (1501) or to compute d and yg. Therefore, 

although we derived an indirect way to verify the optimality condition for (I22L the final procedure is direct. 

Setting parameter A. The equation (|59)) can be used as a guide to set the regularization parameter A. 
Suppose the observation data is composed of Gaussian noise only, then a proper value of A should make the 
solution of (l22l) almost identically zero. Thus, its sparse representation v*, given by (l20l) . should be all zero 
as well. In this case, we can calculate a vector q, which depends only on noise, 

q = 2 G'^H'^Hw. (41) 

When p is calculated by (I40L we find a constraint that —A < p{n) < A, by the property of <j)^. Furthermore, 
since x* r; 0 is expected when the observation is pure noise (i.e., y = w), the values of q Ri p can be 
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Figure 3: Example 1: Simulated data comprising a lowpass component and exponential transients, 
considered bounded 


q{n) € [—A,+A], when |u(n)| ^ /Se, for n € Ijn (42) 

with high probability. The value here is related to e, and since e is extremely small, it can be simply 
assumed to be 0. As a consequence, the value of A needs to satisfy 

A > max{|2[G'rH'rHw]„|,n G Zat}. (43) 

Further, if the statistical property of the noise can be exploited, e.g., w{n) ^ A/’(0,ct^), A can be set statisti¬ 
cally, such that 


A = 2.5a^|l2hi||2 (44) 

where is the standard deviation of the noise and hi is the impulse response corresponding to system 
G^HTH. Note that, G^HTR is a filter with a transfer function Pi{z) = H‘^{z)/R{z). Hence, although we 
give the matrix G to derive the approach to set A, it is not required to compute hi because the filter Pi{z) 
can be implemented directly using H{z) in ([8]), and R{z) in (fTSl) . 

4.4 Example: synthetic signal 

To illustrate ETEA, the algorithm is tested on a simulated signal and compared to other methods. Figure [3] 
shows the test signal and its components: a lowpass signal /, a transient component x, composed of three 
exponential decays, and white Gaussian noise with = 0.20. The noisy observation is shown in Figure [3] in 
gray. 

Denoising. The result of conventional lowpass filtering is shown in Figure IHi. The result is smooth, but 
the step edges of each pulse are diminished. Moreover, if compared to the true low-pass baseline, the result 
is also distorted. The result of using LPF/TVD [5T] is illustrated in Figure IUd. Although the abrupt jumps 
have been preserved, serious staircase effects appear. LPF/TVD uses the first-order derivative operator D in 
regularization. The derivative of an exponential is another exponential, therefore the derivative is unlikely to 
have a sufficiently sparse representation. A compensative method is to widen the bandwidth of the lowpass 
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Figure 4: Example 1: The denoising results of (a) lowpass filtering, (b) LPF/TVD, (c) ETEA (proposed 
method). 

filter, however, this leads to a noisy result. The result presented in Figure IH d is selected by tuning all necessary 
parameters of LPF/TVD to optimize RMSE (root-mean-square error) value. 

The result of the proposed ETEA method is shown in Figure HJ:. The step edges and decay behavior 
are well preserved (RMSE = 0.057). We use a smoothed £i-norm penalty function. In this example, we 
set e = 10“^°, r = 0.94, filter order parameter d = 1, cut-off frequency fc = 0.013 cycle/sample, and 
regularization parameter is calculated by (1441) based on the noise principle. 

Decomposition. Wavelet-based methods for artifact correction have been described in [11 [2^ l40l 156) . 
In Figure [5^., we illustrate denoising and decomposition results obtained using the stationary (un-decimated) 
wavelet transform m with Haar wavelet using hard-threshold determined by the v^2 log Na^i thresholding 
scheme. The denoised result is further enhanced by the artifact-free denoising method [TSKII] which uses total 
variation minimization to overcome pseudo-Gibbs oscillations. Although the denoising output is relatively 
smooth and captures the discontinuities, the decomposed components are both distorted. Compared to the 
true components in Figure |31 the lowpass subband deviates from true lowpass component (especially at about 
n = 100). The transient component, reconstructed from other subbands after denoising, has to compensate for 
the error. Hence, the estimated transient component does not have a zero baseline (see bottom of Figure [5^). 

Non-convex penalty functions can be used to improve the result of ETEA. Here we use the smoothed 
non-convex logarithm penalty function in Table [T] (a = 2). The filtering result is shown in Figure [Dd, where 
discontinuities are more accurately preserved, and the RMSE is reduced to 0.043 compared with the result 
in Figure [4]:. We also illustrate the decomposed / and x components in Figure [5]3. ETEA recovers both of 
the components accurately. The algorithm run-time is not affected by the choice of penalty function. In this 
example, 50 iterations of the algorithm takes about 25 ms on a MacBook Pro 2012 with a 2.7 GHz GPU, 
implemented in Matlab 8.4. 
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Figure 5: Example 1: Comparison of denoising and decomposition results using (a) Wavelet-based method 
and, (b) ETEA with non-convex regularization. 

4.5 Example: artifact removal of ECoG data 

Conventional EEC studies and most clinical EEC applications are restricted below 75 Hz [51]. Advanced 
measuring methods such as ECoG records multichannel cortical potentials from the micro-electrodes located 
inside human skull with a higher sampling rate. In this example, we use a 5-second ECoG signal epoch, with 
a sampling rate of 2713 Hz, recorded from a mesial temporal lobe epilepsy (MTLE) patient. 

Figure [6^ shows the raw data in gray and the corrected data by ETEA in black. There are two Type 1 
artifacts identified in this 5-second epoch. One is at about t = 1.20 second, and the other is at about t = 3.60 
second. In signal model o, suppressing component x which represents artifacts, the corrected data (/ -I- w) 
should preserve the components. We show the corrected data in Figure [6^, where the two Type 1 spikes are 
correctly removed. The decomposed artifact is illustrated in Figure |6^ as well, which adheres a zero-baseline. 

Since wavelet-based methods have been successfully applied to suppress artifacts [nnsiiiD]. a comparison 
with a wavelet-based method is shown in Figure [H d. We use the stationary (un-decimated) wavelet transform 
m with Haar wavelet filter and the non-negative garrote threshold function [18], as recommended in [25] . 
The thresholding has been applied to all the subbands except the lowpass band, and the artifact component 
is obtained by subtracting the corrected data from the raw data. As shown, this approach estimates transient 
pulses but the estimated pulse adheres to the shape of the Haar wavelet filter: a positive-negative pulse (see 
the bottom square box in Figure [6]3), but not the true Type 1 artifact in Figure [T]3. 
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Figure 6: Example 2: Comparison of artifacts removal for ECoG data, (a) ETEA (proposed method), (b) 
wavelet-based method. 

To more clearly illustrate the estimated results by the two methods, we show the details of the corrected 
data and the artifact from t = 3.50 to 3.65 second in dashed line boxes in Figure[T^ and FigureUt, respectively. 
Since the wavelet-based method cannot correctly estimate Type 1 artifact in this case, a ‘bump’ can be 
observed in Figure |6l3 to compensate the error. Moreover, this inappropriate estimation will cause the 
corrected data to change before Type 1 artifact actually occurs (see Figure [TJr). In contrast, ETEA estimates 
the artifact as the abrupt drift with a decay, then it exicises the artifact without influencing the data before 
the transient occurs (see Figure [T^). 

In this example, for the signal with a length of 13565 samples, the proposed algorithm converges within 
40 iterations, and takes about 330 ms on a MacBook Pro 2012 with 2.7 GHz GPU, implemented in Matlab 
8.4. The cost function history is shown in Figure [T] 
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Cost function history 



Figure 7: Example 2: Cost function history. 



Figure 8: Example 3: The simulated test data, generated by driving an AR filter with a sparse sequence 
and adding noise. 

5 Higher-order ETEA 

In the previous section, we have presented ETEA based on signal model ([T|), suitable for Type 1 artifacts, 
where the component x has discontinuous step exponential transients. Here we illustrate another version of 
ETEA based on signal model ([T]), where the component x models Type 0 artifacts. 

Eirst, we consider the operator 


—2r 1 


R2 := 



(45) 


—2r 1 


where R 2 € non-zero coefficients in each row. As a special case, when r = 1, R 2 is the 

second-order difference operator, and ||R 2 x||i is the same as the regularizer used in [3D] for £i detrending. 
Taking R 2 as a hlter, the transfer function is 

R2(z) = (l-rz-^f, (46) 
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Figure 9: Example 3: Denoising and decomposition results by (a) EMD based denoising (EMD-CIIT [32]) 
and (b) ETEA. 

which has a double zero at z = r. The impulse response of system is (n + l)r”, when n ^ 0. It has 

the same shape as Figure [T^, which is a suitable model for the Type 0 artifacts in [25]. Therefore, R 2 X is 
sparse when x is composed of such piecewise smooth transients. Similar to (1221) . we formulate the problem 

X* =argmm|P 2 (x) = ||H(y - x )||2 + (^e([R 2 x]„)| (47) 

n 

to estimate the transient component x, so that the corrected data is y — x, and the low-pass trend / can be 
estimated by (|2T]l . Additionally, (jTTll can be easily solved by ETEA (in Table [2|) substituting R by R 2 . 

An example of complicated and irregular artifacts is the eye blink/movement artifacts in EEG data, which 
may have various morphologies and durations. In this case, we assume that the artifacts can be estimated by 
a continuous piecewise smooth waveform, generated by applying a certain sparse impulse sequence to i?^^(z) 
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(H51) . In other words, we broaden our signal model so that the artifact is not only equivalent to an isolated 
Type 0 transient as in [25) . but also a superposition of multiple such transients, with some freedom of scaling 
and shifting. The problem of estimating x can be solved by (|T7l) in this case as well. 

5.1 Example: Simulated data 

Figure [5] shows the simulated data and its lowpass and transient components. The transient component has 
several pulses with different heights and widths. They are obtained by feeding a sequence of impulses into 
system The hlter is given in (O with Ofc = [1, —2r,r^], fog = 1, and r = 0.950. 

Empirical mode decomposition (EMD) [^[53], a powerful method for analyzing signals, has been suc¬ 
cessfully utilized in different fields, including neuroscience, biometrics, speech recognition, electrocardiogram 
(ECG) analysis, and fault detection [8l llBl [2T1 l39l [54l [62l l65l l67] . 

As a comparison to the proposed approach, we use the EMD based denoising algorithm in (32] , which uses 
wavelet coefficient thresholding techniques on decomposed IMEs. More specifically, we use clear first iterative 
interval thresholding (CUT) with smoothly clipped absolute deviation (SCAD) penalty thresholding [311133] . 
with 20 iterations, and the result is shown in Eigure[3jD. In this example, in order to perform decomposition, 
among the entire eight IMEs, the thresholding is performed on IMEs 1-5, their summation is considered as the 
transients illustrated in Figure [3^, and IMFs 6-8 are considered as the estimation of the lowpass component. 
The EMD based method achieves a decent denoising performance, but does not accurately estimate the 
components. Eor instance, the simulated data has a smooth dip at about n = 700 (circled in Figure |9^). 
EMD decomposes it into higher IMEs since they are varying slowly, which degrades the estimation. We may 
group the lowpass and transient components differently to avoid this problem, for instance, grouping IME 
1-6 together in order to include more oscillations into transient component, but this causes other distortion, 
where the decomposed transient component contains a lowpass signal and does not adhere to a baseline of 
zero. 

The result obtained using second-order ETEA is illustrated in EigureUb. ETEA estimates both the low- 
pass and transient components well, and recovers the signal by a: -I- / precisely with RMSE = 0.87 (with the 
smoothed penalty function). The regularization parameter A for problem (l47l) was similar to (l44l) . 

Af«2.5a^||2h2||2, (48) 

where h 2 is the impulse response of system H'^{z)/R 2 {z), and R 2 {z) is defined in (l46ll . The decomposition 
is accurate. There are no compensating waveforms between the estimated x and / at n = 700, comparing to 
the estimation in Eigure|9^. 

Through numerical experiments, we found that second-order ETEA is not very sensitive to parameter r. 
EigureUt shows the RMSE of denoising the data in EigureHK, using r G [0.85,0.99]. In this test, all filter 
parameters (/c and d) are the same and A is set by (|48p . In most cases, the results are better than EMD-CIIT. 
In addition, r should not be too small or very close to I. If r has to be very small to yield a good estimation 
of component a;, it must fluctuate extremely rapidly, then it must be closer to a sequence of sparse spikes 
(i.e., Type 3 artifacts in [25]), which differs from the signal model. For such a signal, other algorithms are 
more suitable, e.g., LPF/CSD |60l|6T]. Similarly, if r has to be very close to I to fit the transients, then the 
transients must be very close to a piecewise linear signal, which is also not how we model the signal initially. 
As a consequence, we suggest to set r in the range 0.90 < r < 0.98. 
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Figure 10: Example 4: (a) EEG data with ocular artifacts, and corrected data using (b) ETEA and (c) 
MEMD. 

5.2 Example: ocular artifacts suppression 

In this example, we use ETEA with R 2 to correct EEG with eye blink/movement artifacts. Figure [TUh 
shows a 10 second signal from channel Fpl, with sampling rate fs = 256 Hz, downloaded from [^. As a 
channel located on forehead, Fpl is very sensitive to the motion of eyes and eyebrows, and in this example, 
eye movement artifacts with large amplitudes are present through the entire signal. Applying second-order 
ETEA with r = 0.95, the results for corrected data and extracted OA are illustrated in Figure ITOb. 

As a comparison, we use multivariate empirical mode decomposition (MEMD) [52] to correct the data. 
MEMD is a recently developed algorithm extending conventional EMD. It has been used in different aspects of 
EEG signal analysis and applications |9l|50l|49], including removing the ocular artifacts (OA) from multichan¬ 
nel EEG data |53l|42]. In this example, we use 4 EEG channels (Fpl, Fp2, G3, G4) measured simultaneously 
as the input, and decompose the higher-index IMFs (low-frequency subbands) considered to be artifacts |S31 
Section V]. More specifically, among all 14 IMFs decomposed in this example, we use IMF 1-4 as the corrected 
data, and the rest as artifacts. The corrected data and estimated OA are shown in Figure ITOh. 

From the results in FigureflOb and Figure ITOb. ETEA estimates artifacts more clearly than MEMD. In the 
MEMD result, some small-amplitude higher frequency oscillations leak into the artifact (e.g., about t = 5.5 
and 9.0 second). Moreover, some artifacts are introduced after applying MEMD method to correct the data. 
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Figure 11: Comparison of computation time with different signal length. 

(e.g., about t = 3.0 and 9.5 second in Figure ITUhl. Some oscillations are generated as transients where the 
abrupt artifacts occur. In contrast, in Figure ITUb. there are no oscillations introduced either in the estimated 
artifacts or the corrected data. 

Computational Efficiency. Figure [TT] shows the average computation time as a function of the signal 
length. In this experiment, we calculate the time of computation of ETEA and second-order ETEA (H71) 
with different filter settings (controlled by parameter d), using input signal lengths from 5000 to 10^. For each 
length, we average the computation time over 10 trials. For each trial, run 40 iterations of each algorithm. 
The experiment is implemented in Matlab 8.4 on a MacBook Pro 2012 with 2.7 GHz CPU. 

As shown, the proposed algorithms have a run time of order 0{N). Most of the computation time is 
consumed by the step of solving the linear system in Table [21 where Q is a banded matrix and we use 

a fast banded solver. 

Additionally, fast iterative shrinkage-thresholding algorithm (FISTA) [3l [7], which is an acceleration 
scheme for iterative shrinkage-thresholding algorithm (ISTA) [^, (a special formulation of MM) may be 
used to further accelerate the algorithm. 

6 Conclusion 

This paper proposes a new algorithm for denoising and artifact removal for signals comprising artifacts arising 
in measured data, e.g., neural time-series recordings, using sparse optimization. The first algorithm, ETEA, 
assumes the signal is composed of a lowpass signal and an exponential transients (Type 1). It is formulated 
as an optimization problem regularized by differentiable and smooth penalty function. The second algorithm 
is an extension of ETEA, using a higher-order recursive filter, which is applicable for correction of continuous 
protuberance transients (Type 0), and more irregular artifacts. As applications, we have shown that ETEA 
with different regularizers (R and R 2 ) are suitable for the suppression of Type 1 artifact in ECoG data and 
ocular artifacts (OA) (as sequential Type 0 artifacts) in conventional EEG data, with detailed comparisons 
to some state-of-the-art methods. Both of the above algorithms are computationally efficient because they 
are formulated in terms of banded matrices. A promising future work is to extend the above data correcting 
methods to multichannel data. 
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A Proof of Proposition [T] 

Proof. Substitute the variables u and v in (1141) by 

u = y/+ e, and v = \lP t. (49) 


Therefore, for x, z € R, the inequality holds: 


+ e) + ^ ^ f)\\/ + e) ^ f)(yj + e). 


2/ 


+ e 


(50) 


The right of the inequality is the majorizer of the smoothed penalty function f)[y/x^ + e). 

1) If z ^ 0, we can multiply z to the nominator and denominator of the first term on the left side of (ISDl) . 


^'(\/z^ + e) 
2y/P^e 


{x^ + e) = |- + 


y/ z'^ + ( 


e 




Vz^ + ( 


= + (51) 


Multiplying the nominator and denominator of the third term on the left side of (150)) by y/z^ P 
^ + e) + e) + 


2^2^ + e 

=I (<))'(Tz^Te) 


2y/z^~+l 






e 


cffy/^e) 


y/ z'^ + t 




(52) 


Using the results in (I5T|) and (I50|) . the inequality (150)) can be rewritten as 


|-<).((z) + (z) + Mz) - - y/Az) > 


(53) 


which can be reorganize into 


9e{x,Z-,(j)e) 


+ (j)Az) - \<AAz) ^ <fAx). 


(54) 


2) If z = 0 and x ^ 0, by Lagrange’s Mean Value Theorem [551 Theorem 5.10], since function (j){x) is 
continuous and 4>"{x) < 0 on x > 0, there exists a value f in the range y/e < f < y/x"^ + e that (f'{y/e) ^ 
^'{0 ^ 4>'Wx‘^ + e) satisfying 

(^y/x"^ + e- y/e'j = (/>(y/x^ + e) - (j){y/l). (55) 

Moreover, consider the square that is always positive 


[yjx'^ + e — yfeA = x^ + e — 2y/e{x‘^ + e) + e > 0, (56) 
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which implies the inequality 


> 2yJe(x^ + e) — 2e. 


(57) 


Furthermore, we can multiply both sides of (1571) by a positive term 
(EH), so that: 


2^7 


and then adopt the result from 


which leads to 



(58a) 

= </''(v^) + e- Vi) 

(58b) 

> ViO [Vx'^ +e- y/e) 

(58c) 

= (jiiVx"^ + e) - (t>{Vi), 

(58d) 

F > 4>{Vx'^ +e). 

(59) 


Because (j>e in m is differentiable on K, we can find its second-order derivative at zero, 






and by L’Hopital’s rule [SHI Theorem 5.13], we have 


lim 

z —>0 



lim 

^->•o 



2-\/e 


which implies that (1591) is in the same form of the majorizer (1541) at z = 0. 
3) If a: = z = 0, then the condition (El) follows immediately. 


(60) 


(61) 


□ 


References 

[1] M. T. Akhtar, W. Mitsuhashi, and C. J. James. Employing spatially constrained ICA and wavelet 
denoising, for automatic removal of artifacts from multichannel EEG data. Signal Processing, 92(2):401- 
416, 2012. 

[2] F. Bach, R. Jenatton, J. Mairal, and G. Obozinski. Optimization with sparsity-inducing penalties. 
Foundations and Trends in Machine Learning, 4(1):1-106, 2012. 

[3] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. 
SIAM J. Imag. Sci, 2(l):183-202, 2009. 

[4] E. J. Gandes, M. B. Wakin, and S. Boyd. Enhancing sparsity by reweighted 11 minimization. J. Fourier 
Anal. AppL, 14(5):877-905, December 2008. 

[5] M. Cetin and W.G. Karl. Eeature-enhanced synthetic aperture radar image formation based on non¬ 
quadratic regularization. IEEE Trans. Image Process., 10(4):623-631, April 2001. 


21 
















[6] A. Chambolle, R. A. De Vore, N.-Y. Lee, and B. J. Lucier. Nonlinear wavelet image processing: varia¬ 
tional problems, compression, and noise removal through wavelet shrinkage. IEEE Trans. Image Process., 
7(3):319-335, March 1998. 

[7] A. Chambolle and V. R. Dossal. On the convergence of the iterates of FISTA. hal-01060130, September 
2014. preprint. 

[8] C.-P. Chang, J.-C. Lee, Y. Su, P. S. Huang, and T.-M. Tu. Using empirical mode decomposition for iris 
recognition. Computer Standards & Interfaces, 31(4):729-739, 2009. 

[9] H.-C. Chang, P.-L. Lee, M.-T. Lo, Y.-T. Wu, K.-W. Wang, and G.-Y. Lan. Inter-trial analysis of post¬ 
movement beta activities in EEC signals using multivariate empirical mode decomposition. IEEE Trans. 
Neural Systems and Rehabilitation Engineering, 21(4):607-615, July 2013. 

[10] P.-Y. Chen and 1. W. Selesnick. Group-sparse signal denoising: Non-convex regularization, convex 
optimization. IEEE Trans. Signal Process., 62(13):3464“3478, July 2014. 

[11] R. R. Coifman and D. L. Donoho. Translation-invariant de-noising. In A. Antoniadis, editor. Wavelets 
and Statistics. Springer-Verlag Lecture Notes, 1995. 

[12] J. Dammers, M. Schiek, F. Boers, C. Silex, M. Zvyagintsev, U. Pietrzyk, and K. Mathiak. Integration of 
amplitude and phase statistics for complete artifact removal in independent components of neuromagnetic 
recordings. IEEE Trans. Biomed. Eng., 55(10):2353-2362, October 2008. 

[13] S. Durand and J. Froment. Artifact free signal denoising with wavelets. In Proc. ICASSP, 2001. 

[14] S. Durand and J. Froment. Reconstruction of wavelet coefficients using total variation minimization. 
SIAM J. Sci. Comput, 24(5):1754-1767, 2003. 

[15] M. Figueiredo, J. Bioucas-Dias, and R. Nowak. Majorization-minimization algorithms for wavelet-based 
image restoration. IEEE Trans. Image Process., 16(12):2980--2991, December 2007. 

[16] J. Fleureau, A. Kachenoura, L. Albera, J.-C. Nunes, and L. Senhadji. Multivariate empirical mode 
decomposition and application to multichannel filtering. Signal Processing, 91(12):2783-2792, 2011. 

[17] J.-J. Fuchs. On sparse representations in arbitrary redundant bases. IEEE Trans. Inform. Theory, 
50(6):1341~1344, 2004. 

[18] H. Gao. Wavelet shrinkage denoising using the non-negative garrote. Journal of Computational and 
Graphical Statistics, 7(4):pp. 469-488, 1998. 

[19] A. Gholami and S. M. Hosseini. A balanced combination of Tikhonov and total variation regularizations 
for reconstruction of piecewise-smooth signals. Signal Processing, 93(7):1945-1960, 2013. 

[20] C. Guerrero-Mosquera and A. Navia-Vazquez. Automatic removal of ocular artefacts using adaptive 
filtering and independent component analysis for electroencephalogram data. lET Signal Processing, 
6(2):99-106, 2012. 

[21] H. Huang and J. Pan. Speech pitch determination based on Hilbert-Huang transform. Signal Processing, 
86(4):792-803, April 2006. 


22 


[22] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N. C. Yen, C. C. Tung, and H. H. 
Liu. The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time 
series analysis. Proc. Roy. Soc. Lon. A, 454:903-995, March 1998. 

[23] N. E. Huang, M.-L. C. Wu, S. R. Long, S. S. P. Shen, W. Qu, P. Gloersen, and K. L. Fan. A confidence 
limit for the empirical mode decomposition and Hilbert spectral analysis. Proceedings of the Royal Society 
of London. Series A: Mathematical, Physical and Engineering Sciences, 459(2037):2317-2345, September 
2003. 

[24] D. R. Hunter and K. Lange. A tutorial on MM algorithms. Amer. Statist., 58:30-37, 2004. 

[25] M. K. Islam, A. Rastegarnia, A. T. Nguyen, and Z. Yang. Artifact characterization and removal for in 
vivo neural recording. Journal of Neuroscience Methods, 226(0):110-123, 2014. 

[26] C. A. Joyce, 1. F. Gorodnitsky, and M. Kutas. Automatic removal of eye movement and blink artifacts 
from EEG data using blind component separation. Psychophysiology, 41(2):313-325, 2004. 

[27] F. 1. Karahanoglu, 1. Bayram, and D. Van De Ville. A signal processing approach to generalized 1-d 
total variation. IEEE Trans. Signal Process., 59(ll):5265-5274, November 2011. 

[28] E. Kilic. Explicit formula for the inverse of a tridiagonal matrix by backward continued fractions. Applied 
Mathematics and Computation, 197(1):345“357, 2008. 

[29] E. Kilic and P. Stanica. The inverse of banded matrices. Journal of Computational and Applied Mathe¬ 
matics, 237(1):126-135, 2013. 

[30] S. Kim, K. Koh, S. Boyd, and D. Gorinevsky. trend filtering. SIAM Review, 51(2):339-360, 2009. 

[31] Y. Kopsinis and S. McLaughlin. Empirical mode decomposition based soft-thresholding. In 16th European 
Signal Processing Conference, 2008. 

[32] Y. Kopsinis and S. McLaughlin. Development of EMD-based denoising methods inspired by wavelet 
thresholding. IEEE Trans. Signal Process., 57(4):1351-1362, April 2009. 

[33] M. Kowalski. Sparse regression using mixed norms. Applied and Computational Harmonic Analysis, 
27(3):303 - 324, 2009. 

[34] 1. Kozlov and A. Petukhov. Sparse solutions of underdetermined linear systems. In W. Freeden et ah, 
editor. Handbook of Geomathematics. Springer, 2010. 

[35] K. Lange. Optimization. Springer New York, 2004. 

[36] K. Lange, D. Hunter, and 1. Yang. Optimization transfer using surrogate objective functions. J. of 
Comp. Graph. Statist., 9:1-20, 2000. 

[37] N. Mammone, F. La Foresta, and F. G. Morabito. Automatic artifact rejection from multichannel scalp 
EEG by wavelet IGA. IEEE J. Sensors, 12(3):533-542, March 2012. 

[38] D. Mandic. Empirical mode decomposition, multivariate EMD, matlab and data sources. 

[39] D. P. Mandic, N. U. Rehman, Z. Wu, and N. E. Huang. Empirical mode decomposition-based time- 
frequency analysis of multivariate signals: The power of adaptive data analysis. Signal Processing Mag¬ 
azine, IEEE, 30(6):74-86, November 2013. 


23 


[40] B. Molavi and G. A. Dumont. Wavelet-based motion artifact removal for functional near-infrared spec¬ 
troscopy. Physiological Measurement^ 33(2):259, 2012. 

[41] M. K. 1. Molla, M. R. Islam, T. Tanaka, and T. M. Rutkowski. Artifact suppression from EEG signals 
using data adaptive time domain filtering. Neurocomputing, 97:297-308, 2012. 

[42] M. K. 1. Molla, T. Tanaka, and T. M. Rutkowski. Multivariate EMD based approach to EOG artifacts 
separation from EEG. In Proc. ICASSP 2012, pages 653-656, 2012. 

[43] M. K. 1. Molla, T. Tanaka, T. M. Rutkowski, and A. Gichocki. Separation of EOG artifacts from EEG 
signals using bivariate EMD. In Proc. ICASSP 2010, pages 562-565, 2010. 

[44] K. Nazarpour, Y. Wongsawat, S. Sanei, J. A. Ghambers, and S. Oraintara. Removal of the eye-blink 
artifacts from EEGs via STF-TS modeling and robust minimum variance beamforming. IEEE Trans. 
Biomed. Eng., 55(9):2221-2231, 2008. 

[45] D. Needed. Noisy signal recovery via iterative reweighted 11-minimization. In Proc. Porty-Third Asilomar 
Conference on Signals, Systems and Computers, pages 113-117, 2009. 

[46] M. Nikolova. Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized 
least-squares. Multiscale Modeling and Simulation, 4(3):960-991, 2005. 

[47] X. Ning and 1. W. Selesnick. EGG enhancement and QRS detection based on sparse derivatives. Biomed- 
ieal Signal Processing and Control, 8(6):713-723, 2013. 

[48] B. Noureddin, P. D. Lawrence, and G. E. Birch. Online removal of eye movement and blink EEG artifacts 
using a high-speed eye tracker. IEEE Trans. Biomed. Eng., 59(8):2103-2110, 2012. 

[49] A. Omidvarnia, G. Azemi, P. B. Golditz, and B. Boashash. A time-frequency based approach for gen¬ 
eralized phase synchrony assessment in nonstationary multivariate signals. Digital Signal Proeessing, 
23(3):780-790, 2013. 

[50] C. Park, M. Plank, J. Snider, S. Kim, H. C. Huang, S. Gepshtein, T. P. Coleman, and H. Poizner. EEG 
gamma band oscillations differentiate the planning of spatially directed movements of the arm versus eye: 
Multivariate empirical mode decomposition analysis. IEEE Trans. Neural Systems and Rehabilitation 
Engineering, 22(5):1083-1096, September 2014. 

[51] R. M. Rangayyan. Biomedieal Signal Analysis - A Case-study Approach. IEEE and Wiley, New York, 
NY, 2002. 

[52] N. U. Rehman and D. P. Mandic. Multivariate empirical mode decomposition. Proceedings of the Royal 
Society A, 466(2117):1291-1302, 2010. 

[53] N. U. Rehman and D. P. Mandic. Filter bank property of multivariate empirical mode decomposition. 
IEEE Trans. Signal Process., 59(5):2421-2426, May 2011. 

[54] G. Rilling and P. Flandrin. One or two frequencies? The empirical mode decomposition answers. IEEE 
Trans. Signal Process., 56(l):85-95, 2008. 

[55] W. Rudin. Principles of mathematical analysis. McGraw-Hill, 1976. 


24 


[56] H. Sato, N. Tanaka, M. Uchida, Y. Hirabayashi, M. Kanai, T. Ashida, I. Konishi, and A. Maki. Wavelet 
analysis for detecting body-movement artifacts in optical topography signals. Neuroimage, 33(2):580-- 
587, 2006. 

[57] E. D. Schifano, R. L. Strawderman, and M. T. Wells. Majorization-minimization algorithms for nons- 
moothly penalized objective functions. Electron. J. Statist., 4:1258-1299, 2010. 

[58] I. W. Selesnick, S. Arnold, and V. Dantham. Polynomial smoothing of time series with additive step 
discontinuities. IEEE Trans. Signal Process., 60(12):6305-6318, December 2012. 

[59] I. W. Selesnick and I. Bayram. Sparse signal estimation by maximally sparse convex optimization. IEEE 
Trans. Signal Process., 62(5):1078-1092, March 2014. 

[60] I. W. Selesnick, H. L. Graber, Y. Ding, T Zhang, and R. L. Barbour. Transient artifact reduction algo¬ 
rithm (TARA) based on sparse optimization. IEEE Trans. Signal Process., 62(24):6596-6611, December 
2014. 

[61] I. W. Selesnick, H. L. Graber, S. Douglas, S. Pfeil, and R. L. Barbour. Simultaneous low-pass filtering 
and total variation denoising. IEEE Trans. Signal Process., 62(5):1109-1124, March 2014. 

[62] B. Tang, S. Dong, and T. Song. Method for eliminating mode mixing of empirical mode decomposition 
based on the revised blind source separation. Signal Processing, 92(l):248-258, 2012. 

[63] D. Wipf and S. Nagarajan. Iterative reweighted £i and £2 methods for finding sparse solutions. IEEE. 
J. Sel. Top. Signal Processing, 4(2):317-329, April 2010. 

[64] Y. Wongsawat. Efficient implementation of RMVB for eyeblink artifacts removal of EEG via STE-TS 
modeling. In Proc. ROBIO 2008, pages 1567-1572, 2008. 

[65] X. Xie. Illumination preprocessing for face images based on empirical mode decomposition. Signal 
Processing, 103(0):250-257, 2014. 

[66] M. Yaghoobi, T. Blumensath, and M. E. Davies. Dictionary learning for sparse approximations with the 
majorization method. IEEE Trans. Signal Process., 57(6):2178-2191, June 2009. 

[67] J. Yan and L. Lu. Improved Hilbert-Huang transform based weak signal detection methodology and its 
application on incipient fault diagnosis and EGG signal analysis. Signal Processing, 98:74-87, 2014. 

[68] S. Yu, A. S. Khwaja, and J. Ma. Compressed sensing of complex-valued data. Signal Processing, 
92(2):357-362, 2012. 

[69] H. Zeng, A. Song, R. Yan, and H. Qin. EOG artifact correction from EEG recording using stationary 
subspace analysis and empirical mode decomposition. Sensors, 13(11):14839-14859, 2013. 


25 


